COVID-19 and Government Stringency¶

Is government intervention effective in controlling pandemics?¶

COVID-19 has become the largest pandemic of the modern world, with countless social, economic, and political consequences since it first appeared in December 2019. Following the WHO's declaration of a global pandemic on March 11th, 2020, many governments around the world began to implement travel restrictions, mask policies, and social distancing guidelines. Now, in May of 2021, with 1.7 billion vaccinated, the question remains: were these efforts effective, and what can we learn from this experience to guide our continued response and to prepare for any further pandemics that may be cropping up on the horizon?

In [ ]:
import sys
sys.path.insert(1, '../code')
import data_munging
from data_munging import *
As of 2021, the total number of cases reported is 168040871 with the total number of deaths being 3494758
This gives us an approximate mortality rate of 2.0797071445791304% (NOTE: this percentage does not take into account time of infection and therefore is likely not fully accurate)

Primary Question of Interest:¶

  • Has government intervention / response been effective at slowing or mitigating the spread of COVID-19?

Follow-up Questions of Interest:¶

  • If government intervention is effective, what types of interventions are the most effective?
  • Does government intervention work in some countries but not others? What are the potential factors that may influence this circumstance?
  • More to be added as project is developed (TODO)

Data Sources:¶

  • Containment and Health Index (Time-Series): https://ourworldindata.org/grapher/covid-containment-and-health-index
  • Confirmed and Cumulative Cases (Time-Series): https://covid19.who.int/WHO-COVID-19-global-data.csv
  • Population Statistics from World Bank (Time-Series): https://data.worldbank.org/indicator/SP.POP.TOTL
  • Government Effectiveness Index (Time-Series): https://datacatalog.worldbank.org/dataset/worldwide-governance-indicators

Descriptive Analysis¶

WHO's COVID-19 Data¶

We will start our analysis by first taking a look at the WHO's aggregate global COVID-19 data. This data is presented in the long format and contains several key variables. These include:

  • 'Date_reported': The date on which the WHO received that specific row's statistics
  • 'Country': The country where these statistics were gathered
  • 'Country_code': The two or three letter code used to identify the row's given country
  • 'WHO_region': The region of the world the country is found in.
  • 'New_cases': The number of new cases of COVID-19 reported for that day
  • 'Cumulative_cases': The sum of all 'New_cases' for that country up until the row's specific date
  • 'New_deaths': The number of deaths from COVID-19 reported for that day.
  • 'Cumulative_deaths': The sum of all 'New_deaths' for that country up until the row's specific date

Because this report focuses only on country-level analysis and each country's full name is already provided, we do not need either 'Country_code' or 'WHO_region'. We will drop both from the dataset to improve readability and reduce clutter.
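The column drop described above might look something like the following minimal sketch. The tiny DataFrame and its numbers are made up for illustration; the real data would presumably be read from the WHO CSV listed under Data Sources, and the actual munging code in `data_munging` may differ.

```python
import pandas as pd

# Toy stand-in for the WHO file (values are invented for illustration).
# The real data would come from something like
# pd.read_csv("https://covid19.who.int/WHO-COVID-19-global-data.csv").
who_covid = pd.DataFrame({
    "Date_reported": ["2020-03-11", "2020-03-12"],
    "Country_code": ["XX", "XX"],
    "Country": ["Exampleland", "Exampleland"],
    "WHO_region": ["EXMP", "EXMP"],
    "New_cases": [10, 20],
    "Cumulative_cases": [10, 30],
    "New_deaths": [0, 1],
    "Cumulative_deaths": [0, 1],
})

# Parse the date column so it can be used on a datetime axis later.
who_covid["Date_reported"] = pd.to_datetime(who_covid["Date_reported"])

# Drop the two identifier columns that the country-level analysis does not need.
who_covid = who_covid.drop(columns=["Country_code", "WHO_region"])
```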

In [5]:
print("As of 2021, the total number of cases reported is {} with the total number of deaths being {}"
     .format(total_num_cases, total_num_deaths))
print("This gives us an approximate mortality rate of {}% (NOTE: this percentage does not take into account time of infection and therefore is likely not fully accurate)"
     .format(mortality_rate*100))
As of 2021, the total number of cases reported is 168040871 with the total number of deaths being 3494758
This gives us an approximate mortality rate of 2.0797071445791304% (NOTE: this percentage does not take into account time of infection and therefore is likely not fully accurate)

The dataset above contains information on each country up until 05/27/2021. Summing the cumulative cases and deaths from each country, we find that, by 05/27/2021, there had been 168,040,871 reported cases worldwide and 3,494,758 reported deaths. These totals make clear how destructive COVID-19 has been over the past year. To get a better sense of its spread, let's take a look at the number of new cases, day by day, in the top five "most-infected" countries.
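As a rough sketch of how such totals could be derived, the snippet below takes each country's latest row and sums the cumulative columns. The variable names `total_num_cases`, `total_num_deaths`, and `mortality_rate` mirror the print cell above; the toy data, and the choice to rank "most-infected" countries by cumulative cases, are assumptions.

```python
import pandas as pd

# Toy data: three hypothetical countries, two reporting dates each.
who_covid = pd.DataFrame({
    "Country": ["A", "A", "B", "B", "C", "C"],
    "Date_reported": pd.to_datetime(["2021-05-26", "2021-05-27"] * 3),
    "Cumulative_cases": [90, 100, 50, 60, 35, 40],
    "Cumulative_deaths": [1, 2, 1, 1, 0, 1],
})

# The latest row per country already holds that country's running totals.
latest = who_covid.sort_values("Date_reported").groupby("Country").tail(1)

total_num_cases = int(latest["Cumulative_cases"].sum())
total_num_deaths = int(latest["Cumulative_deaths"].sum())

# Crude ratio of deaths to cases; it ignores the timing of infection.
mortality_rate = total_num_deaths / total_num_cases

# Assumption: "most-infected" means the largest cumulative case counts.
top_countries = latest.nlargest(2, "Cumulative_cases")["Country"].tolist()
```

With the real data, `nlargest(5, ...)` would give the five countries plotted below.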

In [7]:
(p9.ggplot(WHO_covid_top_five, p9.aes(x='Date_reported', y='New_cases', group='Country', color='Country')) +
    p9.geom_smooth(span=.05, se=False, alpha = 1) 
 + p9.geom_point(alpha = 0.1)
 + p9.theme_538()
 + p9.xlab('Date')
 + p9.ylab('New Cases per Day')
 + p9.scale_x_datetime(date_breaks='2 month')
 + p9.theme(axis_text_x=p9.element_text(rotation=45, hjust=1))
)
Out[7]:
<ggplot: (8775417317313)>

Prior to March 2020, there appear to have been few cases, if any, in the above countries. Starting sometime in the middle of that month, however, cases began to rise sharply in the United States, followed shortly by Turkey at the beginning of April and Brazil at the end of April. From there on, cases in the US followed an almost cyclical pattern, in which the number of new cases would steadily decrease for a month or two before spiking once again, each time higher than the last. The United States reached its zenith of new cases per day in the early days of January 2021, with nearly 310,000 cases reported in a single day. Since then, cases in the US have been on a downward trajectory, seemingly breaking the trend described earlier. Perhaps this is due to vaccination efforts, but no conclusions can be drawn just yet.

Brazil, France, and Turkey, on the other hand, have seen cases steadily increasing since the early months of 2020. France and Turkey, however, have recently seen a decline in new cases, while Brazil has not.

By far the most troubling trend in the above graph comes from India, where the number of new cases spiked in the middle of September 2020 but then quickly declined heading into 2021. At that point, officials in India began to declare an early victory against COVID-19, only to have such hopes dashed beginning in March of 2021. From then on, the number of new cases in the country rapidly increased, even faster than it had in the US during November of the previous year. India reached a new peak in the number of new cases around the beginning of May 2021. Thankfully, this peak appears to be short-lived, with cases rapidly declining towards the end of May.

Let's now look at how 'New_deaths' lines up with these increases and decreases in cases so that we can get a better sense of how COVID-19 infections progress.

In [8]:
(p9.ggplot(WHO_covid_top_five, p9.aes(x='Date_reported', y='New_deaths', group='Country', color='Country')) +
    p9.geom_smooth(span=.05, se=False, alpha = 1) 
 + p9.geom_point(alpha = 0.1)
 + p9.theme_538()
 + p9.xlab('Date')
 + p9.ylab('New Deaths per Day')
 + p9.scale_x_datetime(date_breaks='2 month')
 + p9.theme(axis_text_x=p9.element_text(rotation=45, hjust=1))
)
Out[8]:
<ggplot: (8775410258164)>

As we can see in the above graph, the number of new deaths per day appears to be closely linked with the number of new cases per day displayed earlier. Judging from the US's peak around January of 2021, deaths appear to lag cases by roughly two weeks. However, there are a few oddities in the graph above, the biggest being the US's spike in new deaths around April of 2020. Although the US had relatively few cases at that point compared to its second spike around January of 2021, the number of new deaths in the first spike is roughly half that of the second. Perhaps US hospitals became better at treating COVID-19 infections, but this does not appear to be the case, as similar spikes did not occur in any of the other top five "most-infected" countries. Regardless, let's now move on to looking at what governments did during this time to stop the spread of COVID-19.

Oxford's Health and Containment Index¶

Oxford's containment and health index tracks government responses on 13 different policy decisions / variables including:

  • School closures
  • Workplace closures
  • Cancel public events
  • Restrictions on gatherings
  • Close public transport
  • Public information campaigns
  • Stay-at-home orders
  • Restrictions on internal movement
  • International travel controls
  • Testing policy
  • Contact tracing
  • Face coverings
  • Vaccination policy

They are all ordinal variables, with values ranging from 0 up to an indicator-specific maximum, each value constituting a specific degree of 'stringency' or strictness. These values are then aggregated into one collective value, which the Oxford team refers to as a country's health and containment index and which provides a standardized score detailing the overall stringency of a government's response.

Just as we did for the WHO's dataset above, let's take a look at the "most-infected" countries' health and containment indices over time.

In [81]:
(p9.ggplot(containment_index_top_five, p9.aes(x='Date_reported', y='containment_index', group='Country', color='Country')) +
    p9.geom_line()
 + p9.theme_538()
 + p9.xlab('Date')
 + p9.ylab('Health and Containment Index')
 + p9.scale_x_datetime(date_breaks='2 month')
 + p9.theme(axis_text_x=p9.element_text(rotation=45, hjust=1))
)
Out[81]:
<ggplot: (8775291471261)>

One general trend stands out in the above plot: starting in late February of 2020, these five countries began to rapidly implement policies intended to either slow or entirely prevent the spread of COVID-19. This pattern of response, however, was not unique to these five countries. As shown below, it was shared by much of the world.

In [82]:
(p9.ggplot(containment_index, p9.aes(x='containment_index')) +
    p9.geom_histogram(binwidth=1, fill='green', alpha=.7)
 + p9.theme_538()
 + p9.xlab('Containment Index')
 + p9.ggtitle('Containment Index Values from 01/01/2020 to 05/26/2021')
)
Out[82]:
<ggplot: (8775291624695)>

The above histogram details the values of health and containment indices from countries around the world starting on 01/01/2020. As one can clearly see, one value stands out from the rest: 0. For much of January and February in 2020, COVID-19 was not yet viewed as a global threat. Most cases were in China, and many underestimated the virus's ability to spread and its potential harm. This is likely the reason why so many 0s exist in the above distribution, a theory that is supported by the following histogram:

In [78]:
(p9.ggplot(containment_index_post_ann, p9.aes(x='containment_index')) +
    p9.geom_histogram(binwidth=1, fill='green', alpha=.7)
 + p9.theme_538()
 + p9.xlab('Containment Index')
 + p9.ggtitle('Containment Index Values Following WHO Announcement')
)
Out[78]:
<ggplot: (8775292654867)>

The above histogram displays the distribution of health and containment indices following the WHO's announcement on 03/11/2020, in which the organization declared COVID-19 to be a global pandemic. As we can see, few values below 10 exist at this point, as most countries had begun implementing policies intended to combat the growing outbreak. Seeing as we are interested in how effective government response was, we will be excluding the data from prior to the WHO's announcement. In this way, we can protect our data from being overly biased by what was essentially a "nonresponse" by many countries in both January and early February.
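The exclusion described above amounts to a simple date filter. The following is a minimal sketch on invented data; the name `containment_index_post_ann` follows the plotting cell above, while keeping the announcement day itself (`>=` rather than `>`) is an assumption.

```python
import pandas as pd

# Toy containment index series for one hypothetical country.
containment_index = pd.DataFrame({
    "Country": ["A", "A", "A"],
    "Date_reported": pd.to_datetime(["2020-02-01", "2020-03-11", "2020-04-01"]),
    "containment_index": [0.0, 25.0, 70.0],
})

# Date of the WHO's pandemic declaration, as stated in the text.
who_announcement = pd.Timestamp("2020-03-11")

# Keep only observations on or after the announcement
# (including the announcement day itself is an assumption).
containment_index_post_ann = containment_index[
    containment_index["Date_reported"] >= who_announcement
]
```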

Data Merging¶

Another thing we have to keep in mind when dealing with an infectious agent is that any type of government response, whether a mask mandate or travel restrictions, will take time to have an effect. In the worst case, a COVID-19 infection can progress as follows:

  • Incubation Period: Up to 14 days (https://www.cdc.gov/coronavirus/2019-ncov/hcp/clinical-guidance-management-patients.html)
  • From Symptom Onset to Hospitalization: Up to 12.4 days (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7589278/)
  • Maximum amount of time between initial infection and hospitalization: 26.4 days (truncated to 26)

This means that a COVID-19 infection can last up to about a month or more if symptoms are severe. If testing is not widespread, it also means that a person who contracted the virus a month ago might not get counted until this period of time is up. Therefore, we cannot simply match the two datasets described above directly: the policies of one day do not affect the cases or deaths for that day, but rather the ones about a month down the line. With this in mind, we will be adding a new column to the WHO's dataset:

  • 'Date_of_Impacting_Restrictions': The date when the COVID-19 policies capable of impacting the cases for the given 'Date_reported' observation were implemented (26 days prior).
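A minimal sketch of how this lagged column could be built with pandas is shown below. The toy rows and the constant name `LAG_DAYS` are illustrative assumptions; only the 26-day offset and the column names come from the text.

```python
import pandas as pd

# 14-day incubation + 12.4 days from onset to hospitalization, truncated to 26.
LAG_DAYS = 26

# Toy WHO-style rows for one hypothetical country.
who_covid = pd.DataFrame({
    "Country": ["A", "A"],
    "Date_reported": pd.to_datetime(["2020-04-06", "2020-04-07"]),
    "New_cases": [10, 12],
})

# Shift each report date back by the lag to find the policy date that
# could plausibly have influenced that day's numbers.
who_covid["Date_of_Impacting_Restrictions"] = (
    who_covid["Date_reported"] - pd.Timedelta(days=LAG_DAYS)
)
```

Under this scheme, a merge with the containment index would match on 'Country' and on the index's own date column against 'Date_of_Impacting_Restrictions', rather than against 'Date_reported'.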

With our dates now lined up, we can start looking into the other variable we will be using for merging the above two datasets, 'Country'. One interesting thing that can be seen in the two line plots above (the ones before the histograms) is that 'United States of America' is not listed as a country in Oxford's dataset. Instead, the country is represented under a slightly different label, 'United States.' This suggests that there may be other countries in these two datasets that are labelled differently as well. Before we can effectively merge the two datasets, we will have to address this issue. Let's first take a look at how many countries are represented in each of the two datasets.

In [83]:
print("The WHO dataset contains {} countries while the Oxford dataset only contains {} countries"
      .format(WHO_country_list.shape[0], ci_country_list.shape[0]))
The WHO dataset contains 236 countries while the Oxford dataset only contains 185 countries

By utilizing pandas' 'unique()' function, we learn that the WHO's global COVID-19 data has information on 236 countries / territories, while Oxford's containment and health index only has information on 185. To avoid missingness, we will likely have to make our final dataset conform to Oxford's. But first, let's check which differences exist between them and make sure no countries are accidentally left out due to differences in naming.

In [17]:
print(sorted(list(set(WHO_country_list) - set(ci_country_list)) + list(set(ci_country_list) - set(WHO_country_list))))
['American Samoa', 'Anguilla', 'Antigua and Barbuda', 'Armenia', 'Bolivia', 'Bolivia (Plurinational State of)', 'Bonaire', 'British Virgin Islands', 'Brunei', 'Brunei Darussalam', 'Cabo Verde', 'Cape Verde', 'Cayman Islands', 'Comoros', 'Cook Islands', "Cote d'Ivoire", 'Curaçao', 'Côte d’Ivoire', "Democratic People's Republic of Korea", 'Democratic Republic of Congo', 'Democratic Republic of the Congo', 'Equatorial Guinea', 'Faeroe Islands', 'Falkland Islands (Malvinas)', 'Faroe Islands', 'French Guiana', 'French Polynesia', 'Gibraltar', 'Grenada', 'Guadeloupe', 'Guernsey', 'Guinea-Bissau', 'Holy See', 'Hong Kong', 'Iran', 'Iran (Islamic Republic of)', 'Isle of Man', 'Jersey', 'Kosovo', 'Kosovo[1]', "Lao People's Democratic Republic", 'Laos', 'Macao', 'Maldives', 'Marshall Islands', 'Martinique', 'Mayotte', 'Micronesia (Federated States of)', 'Moldova', 'Montenegro', 'Montserrat', 'Nauru', 'New Caledonia', 'Niue', 'North Macedonia', 'Northern Mariana Islands (Commonwealth of the)', 'Other', 'Palau', 'Palestine', 'Pitcairn Islands', 'Republic of Korea', 'Republic of Moldova', 'Russia', 'Russian Federation', 'Réunion', 'Saba', 'Saint Barthélemy', 'Saint Helena', 'Saint Kitts and Nevis', 'Saint Lucia', 'Saint Martin', 'Saint Pierre and Miquelon', 'Saint Vincent and the Grenadines', 'Samoa', 'Sao Tome and Principe', 'Sint Eustatius', 'Sint Maarten', 'South Korea', 'Syria', 'Syrian Arab Republic', 'Taiwan', 'Tanzania', 'The United Kingdom', 'Timor', 'Timor-Leste', 'Tokelau', 'Turks and Caicos Islands', 'Tuvalu', 'United Kingdom', 'United Republic of Tanzania', 'United States', 'United States of America', 'Venezuela', 'Venezuela (Bolivarian Republic of)', 'Viet Nam', 'Vietnam', 'Wallis and Futuna', 'occupied Palestinian territory, including east Jerusalem']

As suspected, quite a few of the countries above appear as differences solely because of how they are named. Some examples include 'Viet Nam' vs 'Vietnam' and 'United States' vs 'United States of America.' We will use a 'fuzzy_left_join()' to fix these issues. Seeing as the Oxford dataset is the one with fewer countries, it will also be the one being joined to (i.e., the left dataset). This way we can minimize the amount of missingness present in the final result.
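The internals of 'fuzzy_left_join()' are not shown here, so the following is only a sketch of the general idea, using the standard library's `difflib` as a stand-in. The helper name `match_country_names` and the 0.6 similarity cutoff are hypothetical choices, not the notebook's actual implementation.

```python
import difflib

def match_country_names(left_names, right_names, cutoff=0.6):
    """Map each left name to its closest right name, or None if nothing is close."""
    mapping = {}
    for name in left_names:
        if name in right_names:
            # Exact matches need no fuzzy lookup.
            mapping[name] = name
            continue
        close = difflib.get_close_matches(name, right_names, n=1, cutoff=cutoff)
        mapping[name] = close[0] if close else None
    return mapping

# Left side: Oxford-style names. Right side: WHO-style names.
oxford_names = ["Vietnam", "United States", "Russia", "France"]
who_names = ["Viet Nam", "United States of America", "Russian Federation", "France"]

name_map = match_country_names(oxford_names, who_names)
for left, right in name_map.items():
    print(f"{left!r} -> {right!r}")
```

In this sketch, 'Russia' falls below the cutoff against 'Russian Federation' and is left unmatched, which illustrates why a fuzzy join usually still needs a small hand-written mapping for the stragglers.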

Having merged the two datasets together and done some last-minute data munging, we are finally ready to start looking into some of the other pieces of data we will need for this analysis, starting with country population. We will be using 2019 data from the World Bank. This data will later be used to scale both new and cumulative cases by population so that the larger countries do not simply dominate our analysis. We will start by reading it in as a csv file.

As with the previous two datasets, the names of countries in the World Bank's data and in our new merged dataset differ slightly. We will take care of this with another 'fuzzy_left_join().' Again, seeing as our merged dataset is the limiting factor, we will assign it to be the left dataset in the join.
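The population scaling mentioned above could be sketched as follows. The column names 'Population', 'New_cases_pop_scaled', and 'Cumulative_cases_pop_scaled' appear in later cells; expressing the result as a percentage of the population is an assumption based on the axis labels used there, and the numbers are invented.

```python
import pandas as pd

# Two hypothetical countries with identical raw counts but different sizes.
merged = pd.DataFrame({
    "Country": ["A", "B"],
    "New_cases": [500, 500],
    "Cumulative_cases": [20_000, 20_000],
    "Population": [1_000_000, 10_000_000],
})

# Express each count as a percentage of the country's population
# (percent units are an assumption).
for col in ["New_cases", "Cumulative_cases"]:
    merged[f"{col}_pop_scaled"] = merged[col] / merged["Population"] * 100
```

After scaling, the smaller country A shows ten times the burden of country B even though the raw counts are equal, which is the distortion the scaling is meant to expose.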

In [25]:
print(sorted(list(set(merged_country_list) - set(pop_country_list)) + list(set(pop_country_list) - set(merged_country_list))))
['American Samoa', 'Antigua and Barbuda', 'Arab World', 'Armenia', 'Bahamas', 'Bahamas, The', 'British Virgin Islands', 'Brunei', 'Brunei Darussalam', 'Cabo Verde', 'Cape Verde', 'Caribbean small states', 'Cayman Islands', 'Central Europe and the Baltics', 'Channel Islands', 'Comoros', 'Congo', 'Congo, Dem. Rep.', 'Congo, Rep.', 'Curacao', 'Czech Republic', 'Czechia', 'Democratic Republic of Congo', 'Early-demographic dividend', 'East Asia & Pacific', 'East Asia & Pacific (IDA & IBRD countries)', 'East Asia & Pacific (excluding high income)', 'Egypt', 'Egypt, Arab Rep.', 'Equatorial Guinea', 'Euro area', 'Europe & Central Asia', 'Europe & Central Asia (IDA & IBRD countries)', 'Europe & Central Asia (excluding high income)', 'European Union', 'Faeroe Islands', 'Faroe Islands', 'Fragile and conflict affected situations', 'French Polynesia', 'Gambia', 'Gambia, The', 'Gibraltar', 'Grenada', 'Guinea-Bissau', 'Heavily indebted poor countries (HIPC)', 'High income', 'Hong Kong', 'Hong Kong SAR, China', 'IBRD only', 'IDA & IBRD total', 'IDA blend', 'IDA only', 'IDA total', 'Iran', 'Iran, Islamic Rep.', 'Isle of Man', 'Korea, Dem. People’s Rep.', 'Korea, Rep.', 'Kyrgyz Republic', 'Kyrgyzstan', 'Lao PDR', 'Laos', 'Late-demographic dividend', 'Latin America & Caribbean', 'Latin America & Caribbean (excluding high income)', 'Latin America & the Caribbean (IDA & IBRD countries)', 'Least developed countries: UN classification', 'Low & middle income', 'Low income', 'Lower middle income', 'Macao', 'Macao SAR, China', 'Maldives', 'Marshall Islands', 'Micronesia, Fed. Sts.', 'Middle East & North Africa', 'Middle East & North Africa (IDA & IBRD countries)', 'Middle East & North Africa (excluding high income)', 'Middle income', 'Montenegro', 'Nauru', 'New Caledonia', 'North America', 'North Macedonia', 'Northern Mariana Islands', 'Not classified', 'OECD members', 'Other small states', 'Pacific island small states', 'Palau', 'Palestine', 'Post-demographic dividend', 'Pre-demographic dividend', 'Russia', 'Russian Federation', 'Samoa', 'Sao Tome and Principe', 'Sint Maarten (Dutch part)', 'Slovak Republic', 'Slovakia', 'Small states', 'South Asia', 'South Asia (IDA & IBRD)', 'South Korea', 'St. Kitts and Nevis', 'St. Lucia', 'St. Martin (French part)', 'St. Vincent and the Grenadines', 'Sub-Saharan Africa', 'Sub-Saharan Africa (IDA & IBRD countries)', 'Sub-Saharan Africa (excluding high income)', 'Syria', 'Syrian Arab Republic', 'Taiwan', 'Timor', 'Timor-Leste', 'Turks and Caicos Islands', 'Tuvalu', 'United States Virgin Islands', 'Upper middle income', 'Venezuela', 'Venezuela, RB', 'Virgin Islands (U.S.)', 'West Bank and Gaza', 'World', 'Yemen', 'Yemen, Rep.']
In [85]:
merge_ci_WHO.shape
Out[85]:
(71918, 14)

After this last merge, we are left with a dataframe of 71918 observations and 14 variables. We will now proceed by adding a measure of government effectiveness to the dataframe. This measure also comes from the World Bank and can be found in the links at the top of this page. After some serious data munging, we can start to look at what the distribution of this government effectiveness indicator looks like:

In [30]:
(p9.ggplot(gov_eff_ind, p9.aes(x='Government Effectiveness: Estimate'))
 + p9.geom_density(fill='dodgerblue', color='dodgerblue', alpha = 0.2)
 + p9.theme_538()
)
Out[30]:
<ggplot: (8775325136833)>

The above distribution appears to be approximately normal with a slight deviation to the left. For our purposes, we will likely only be utilizing the countries with an estimated effectiveness above 0, which corresponds roughly to the 50th percentile and above. For now, we can proceed as before with yet another 'fuzzy_left_join().'
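One way the percentile cut-off could be expressed is sketched below. The column names 'Government Effectiveness: Estimate' and 'Gov_Eff_Per' and the frame name `eff_gov` appear elsewhere in this notebook; computing the percentile with a rank over the sample (rather than using a percentile column supplied by the World Bank) and using a strict `> 50` threshold are both assumptions, and the estimates are invented.

```python
import pandas as pd

# Four hypothetical countries with made-up effectiveness estimates.
gov_eff_ind = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "Government Effectiveness: Estimate": [-1.2, -0.3, 0.4, 1.5],
})

# Percentile rank of each country within the sample, on a 0-100 scale.
gov_eff_ind["Gov_Eff_Per"] = (
    gov_eff_ind["Government Effectiveness: Estimate"].rank(pct=True) * 100
)

# Keep only the more effective half (strict inequality is an assumption).
eff_gov = gov_eff_ind[gov_eff_ind["Gov_Eff_Per"] > 50]
```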

In [31]:
print(sorted(list(set(merged_country_list) - set(gov_eff_country_list)) + list(set(gov_eff_country_list) - set(merged_country_list))))
['American Samoa', 'Anguilla', 'Antigua and Barbuda', 'Armenia', 'Bahamas', 'Bahamas, The', 'Brunei', 'Brunei Darussalam', 'Cabo Verde', 'Cape Verde', 'Cayman Islands', 'Comoros', 'Congo', 'Congo, Dem. Rep.', 'Congo, Rep.', 'Czech Republic', 'Czechia', 'Democratic Republic of Congo', 'Egypt', 'Egypt, Arab Rep.', 'Equatorial Guinea', 'Faeroe Islands', 'French Guiana', 'Gambia', 'Gambia, The', 'Grenada', 'Guinea-Bissau', 'Hong Kong', 'Hong Kong SAR, China', 'Iran', 'Iran, Islamic Rep.', 'Jersey, Channel Islands', 'Korea, Dem. People’s Rep.', 'Korea, Rep.', 'Kyrgyz Republic', 'Kyrgyzstan', 'Lao PDR', 'Laos', 'Macao', 'Macao SAR, China', 'Maldives', 'Marshall Islands', 'Martinique', 'Micronesia, Fed. Sts.', 'Monaco', 'Montenegro', 'Nauru', 'North Macedonia', 'Palau', 'Palestine', 'Reunion', 'Russia', 'Russian Federation', 'Samoa', 'San Marino', 'Sao Tome and Principe', 'Slovak Republic', 'Slovakia', 'South Korea', 'St. Kitts and Nevis', 'St. Lucia', 'St. Vincent and the Grenadines', 'Syria', 'Syrian Arab Republic', 'Taiwan', 'Taiwan, China', 'Timor', 'Timor-Leste', 'Tuvalu', 'United States Virgin Islands', 'Venezuela', 'Venezuela, RB', 'Virgin Islands (U.S.)', 'West Bank and Gaza', 'Yemen', 'Yemen, Rep.']

With all of that done, we can now start to look at our desired dataframe in its entirety and begin some basic inferential analysis.

Inferential Analysis¶

The graph below compares each country's government effectiveness (as a percentile) to its average containment index value.

In [34]:
fig = px.scatter(avg_con_ind, x='Gov_Eff_Per', y='avg(containment_index)',
                 color='avg(containment_index)', hover_name='Country', trendline='lowess')
fig.show()

As we can see, there may be a slight upward trend, where more effective governments are more likely to implement greater restrictions when it comes to COVID-19. The trend is weak, however, so this association may simply be coincidental. Let's take a look at how government effectiveness is tied to cumulative cases (scaled by population).

In [36]:
fig = px.scatter(avg_cc_by_gov_eff, x='Gov_Eff_Per', y='Cumulative_cases_pop_scaled',
                 color='Cumulative_cases_pop_scaled', hover_name='Country', trendline='lowess')
fig.show()

It seems as though more effective governments have more cases. While this may seem strange at first, it makes sense in a way: more effective governments likely have better testing policies than their weaker counterparts and are thus more likely to have accurate COVID-19 statistics.

However, because we are interested in whether or not government response is actually effective in pandemic situations, we will be cutting out all countries currently suffering under a weak government. A weak government that is already ineffective in day-to-day operations will likely also be ineffective at handling a pandemic. We want to see what happens when a government that can respond, does. The cut-off chosen, as stated previously, is at the 50th percentile mark. Visualizing these countries alone, we can start to get a sense of how containment policies affect both cumulative and new cases.

In [44]:
fig = px.scatter(eff_gov, x='containment_index', y='Cumulative_cases_pop_scaled', 
                 animation_frame='Date_reported',
                 labels = {'containment_index':'Containment Index', 
                           'Cumulative_cases_pop_scaled':'Percent of Population Infected'},
                 color = "Country", size = "Population", size_max=50,
                 range_x=[0,100], range_y=[-2,12],
                 animation_group="Country", hover_name='Country')
fig.show()
In [49]:
fig = px.scatter(eff_gov, x='containment_index', y='New_cases_pop_scaled', 
                 animation_frame='Date_reported',
                 labels = {'containment_index':'Containment Index', 
                           'New_cases_pop_scaled':'New Percent of Population Infected'},
                 color = "Country", size = "Population", size_max=50,
                 range_x=[0,100], range_y=[-0.05,0.15],
                 animation_group="Country", hover_name='Country')
fig.show()

The above graphs do not appear to show any strong indications of whether or not containment policies are working. They are being implemented more and more as time goes on but both cumulative cases and new cases still appear to be rising in the majority of the countries shown above.

Because of this, let's run an OLS regression on new cases (scaled by population) using the health and containment index as our explanatory variable:

In [86]:
ols_result.summary()
Out[86]:
OLS Regression Results
Dep. Variable:      New_cases_pop_scaled   R-squared:                 0.036
Model:                               OLS   Adj. R-squared:            0.036
Method:                    Least Squares   F-statistic:               2580.
Date:                   Sun, 06 Jun 2021   Prob (F-statistic):         0.00
Time:                           22:25:06   Log-Likelihood:       1.8322e+05
No. Observations:                  68400   AIC:                  -3.664e+05
Df Residuals:                      68398   BIC:                  -3.664e+05
Df Model:                              1
Covariance Type:               nonrobust

                         coef    std err          t      P>|t|     [0.025     0.975]
Intercept             -0.0016      0.000     -8.474      0.000     -0.002     -0.001
containment_index      0.0002   3.28e-06     50.797      0.000      0.000      0.000

Omnibus:               50193.977   Durbin-Watson:                  0.560
Prob(Omnibus):             0.000   Jarque-Bera (JB):       246022864.587
Skew:                      1.847   Prob(JB):                        0.00
Kurtosis:                296.786   Cond. No.                        164.


Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
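The cell that fits `ols_result` is not shown, so the following is only a sketch of the same kind of single-predictor least-squares fit. It uses numpy's `lstsq` as a stand-in rather than whatever call produced the summary above, and the data are synthetic (generated with an arbitrary slope of 0.001), so the numbers bear no relation to the real results.

```python
import numpy as np

# Synthetic data: a made-up linear relationship plus noise.
rng = np.random.default_rng(0)
containment_index = rng.uniform(0, 100, size=500)
new_cases_pop_scaled = (
    0.01 + 0.001 * containment_index + rng.normal(0, 0.01, size=500)
)

# Design matrix with an intercept column, then ordinary least squares.
X = np.column_stack([np.ones_like(containment_index), containment_index])
coef, *_ = np.linalg.lstsq(X, new_cases_pop_scaled, rcond=None)
intercept, slope = coef

# R-squared: share of the variance in the response explained by the fit.
fitted = X @ coef
ss_res = np.sum((new_cases_pop_scaled - fitted) ** 2)
ss_tot = np.sum((new_cases_pop_scaled - new_cases_pop_scaled.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot
```

The `slope` here plays the role of the 'containment_index' coefficient in the summary table, and `r_squared` the role of its R-squared entry.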

It appears as though the health and containment index is associated with a change in the number of new cases a country sees day by day, but not in the way one would expect. According to the above OLS regression, a one-point increase in the containment index correlates with an additional 0.0002% of a country's population becoming infected per day. For reference, in the United States (roughly 328 million people in 2019), that is an increase of about 656 cases per day. Before drawing any conclusions, however, let us first check these results by tracking the two variables' Pearson correlation over time:

In [70]:
px.line(pearson_corr_df, y="pearson_corr", title='Pearson Correlation between New Cases (Scaled) and Containment Index')

As with the OLS regression, rising containment indices appear to be correlated with rising numbers of new cases. The correlation turns negative only a few times, and even then it never drops below -0.15.
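How `pearson_corr_df` was built is not shown. One plausible reading, sketched below on toy data, is a per-date Pearson correlation computed across countries; that interpretation is an assumption, as are the invented values.

```python
import pandas as pd

# Toy panel: three hypothetical countries observed on two dates.
eff_gov = pd.DataFrame({
    "Date_reported": pd.to_datetime(["2020-04-01"] * 3 + ["2020-04-02"] * 3),
    "Country": ["A", "B", "C"] * 2,
    "containment_index": [10.0, 20.0, 30.0, 10.0, 20.0, 30.0],
    "New_cases_pop_scaled": [0.01, 0.02, 0.03, 0.03, 0.02, 0.01],
})

# For each date, correlate the two variables across countries.
cols = ["containment_index", "New_cases_pop_scaled"]
pearson_corr = eff_gov.groupby("Date_reported")[cols].apply(
    lambda day: day["containment_index"].corr(day["New_cases_pop_scaled"])
)

# One row per date, ready to be passed to a line plot.
pearson_corr_df = pearson_corr.rename("pearson_corr").to_frame()
```

In the toy data the two variables move together on the first date and in opposite directions on the second, so the series swings from a strongly positive to a strongly negative correlation.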

Conclusion¶

It appears as though rising case numbers are associated with an increasing level of government stringency. This is likely because a country that has already seen a few thousand or so cases of COVID-19 does not have the means to stop it all at once and will raise restrictions in response. It does not mean that restrictions are causing cases to rise. It just means that in some cases those restrictions may come too late or simply be ineffective.

However, without a proper control to observe what would have happened in the absence of government intervention, it is hard to tell whether or not intervention was ineffective. It is likely, in fact, that government intervention did mitigate the effects of COVID-19, at least to a degree. Ultimately, though, our results are inconclusive. A further study is needed, perhaps with data from simulations of what would happen in the absence of government response.
